(57) ABSTRACT

A dynamic routing of object requests among a collection or
cluster of servers factors in the caching efficiency of the
servers and the load balance, or the load balance alone. The
routing information on server location can be dynamically
updated by piggybacking meta information on the request
response. To improve the cache hit rate at the server, the
server selection factors in the identifier (e.g., URL) of the
object requested. A partitioning method can map object
identifiers into classes, and requester nodes maintain a
server assignment table that maps each class to a server
selection. The class-to-server assignment table can change
dynamically as the workload varies and also factors in the
server capacity. The requester node need only be informed of
a dynamic change in the class-to-server assignment on an
"on-demand" basis, which reduces communication traffic. In
the Internet, the collection of servers can be either a proxy
or Web server cluster and can include a DNS and/or
TCP-router. The PICS protocol can be used by the server to
provide the meta information on the "new" class-to-server
mapping when a request is directed to a server based on an
invalid or obsolete class-to-server mapping. DNS-based
routing for load balancing of a server cluster can also
benefit: by piggybacking meta data with the returned object
to reassign the requester to another server for future
requests, adverse effects of the TTL on the load balance are
overcome without increasing traffic.
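
The class-based routing and on-demand update described in this abstract can be
illustrated with a minimal Python sketch. Everything named here is an assumption
made for illustration and does not come from the patent: the CRC32-based
partitioning in `url_class`, the value of `NUM_CLASSES`, the `Server` and
`Requester` classes, and the plain dict that stands in for PICS-style
piggybacked meta information.

```python
import zlib

NUM_CLASSES = 16  # assumed number of object classes (not from the source)


def url_class(url: str, num_classes: int = NUM_CLASSES) -> int:
    """Partition object identifiers into classes with a stable hash (assumed scheme)."""
    return zlib.crc32(url.encode("utf-8")) % num_classes


class Server:
    """Hypothetical server that knows the current class-to-server table."""

    def __init__(self, name, table):
        self.name = name
        self.table = table  # authoritative mapping: class -> server name

    def handle(self, url):
        cls = url_class(url)
        body = f"{url} served by {self.name}"
        owner = self.table[cls]
        # The request is served either way. A correction is piggybacked only
        # when the requester routed on an obsolete mapping.
        meta = {} if owner == self.name else {cls: owner}
        return body, meta


class Requester:
    """Hypothetical requester node holding a local, possibly stale table."""

    def __init__(self, table, servers):
        self.table = dict(table)
        self.servers = servers  # server name -> Server

    def fetch(self, url):
        cls = url_class(url)
        body, meta = self.servers[self.table[cls]].handle(url)
        self.table.update(meta)  # on-demand update, no separate message
        return body
```

In this sketch a requester with a stale table is still served by the old server
once, and its next request for an object of the same class goes to the newly
assigned server.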

75 Claims, 15 Drawing Sheets 

Detailed Description Text (30) : 

In a preferred embodiment, the DNS (167) collects the number of requests issued
from each requester and will generate a requester-to-server assignment table to
balance the load among the servers. (For heterogeneous servers, the assigned load
can be made proportional to the server's processing capacity.) When a
(name-to-address) mapping request arrives at the DNS (167), a server (161 ... 163)
is assigned based on the requester name (or IP address) in the assignment table. The
mapping is hierarchical and multi-level, e.g., URL=>Class=>virtual server=>server.
The DNS (167) can collect the load statistics and update the assignment table (225)
based on a measurement interval (much) smaller than the TTL. Thus, a new assignment
table can be quickly generated, to better reflect load conditions. All servers
(161 ... 163) get the up-to-date version of the assignment table (225) from the
DNS (167). As before, the requesters (110 ... 153) need not be informed of the
change; they can still send requests based on the previous (name-to-address)
mapping. However, if a server receives a request from a requester that is no longer
assigned to that server, the server will inform the requester of the server
(161 ... 163) to which future requests should be issued. The current request will
still be served and the new assignment information can be piggybacked, e.g., using
PICS or a similar mechanism, with the response or returned object. When a server is
overloaded, it can send an alarm signal to the DNS (167). Each time an alarm is
received, the DNS (167) can recalculate the assignment table to reduce the number
of requesters assigned to any overloaded servers. The requesters can also be
partitioned into classes so that the assignment table can then become a
class-to-server assignment.
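
One way to picture the requester-to-server assignment table is the Python sketch
below. It is not the patent's algorithm: the greedy heuristic, the function names
`build_assignment` and `rebalance_on_alarm`, and the `derate` factor applied to an
overloaded server are all assumptions chosen to show load being assigned in
proportion to capacity and recalculated after an alarm.

```python
def build_assignment(request_counts, capacities):
    """Greedily assign requesters so load stays roughly proportional to capacity.

    request_counts: requester name -> requests seen in the measurement interval
    capacities:     server name -> relative processing capacity
    Returns (table, load), where table maps requester -> server.
    """
    load = {server: 0 for server in capacities}
    table = {}
    # Place the heaviest requesters first; break ties by name for determinism.
    for requester, count in sorted(request_counts.items(),
                                   key=lambda kv: (-kv[1], kv[0])):
        # Pick the server whose load-to-capacity ratio would be lowest.
        target = min(capacities,
                     key=lambda s: ((load[s] + count) / capacities[s], s))
        table[requester] = target
        load[target] += count
    return table, load


def rebalance_on_alarm(request_counts, capacities, overloaded, derate=0.5):
    """Rebuild the table after an alarm, treating the overloaded server as slower.

    The derate factor is a hypothetical knob, not something the source specifies.
    """
    adjusted = dict(capacities)
    adjusted[overloaded] = capacities[overloaded] * derate
    return build_assignment(request_counts, adjusted)
```

With request counts of 40, 30 and 20 and capacities of 2 and 1, this sketch puts
60 requests on the larger server and 30 on the smaller one. An alarm from the
larger server then shifts one requester away from it.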
(57) ABSTRACT

A method is provided for load balancing requests for a 
replicated service or application among a plurality of servers 
operating instances of the replicated service or application. 
A policy is selected for choosing a preferred server from the 
plurality of servers according to one or more specified status 
or operational characteristics of the servers, such as the 
least-loaded or closest server. The policy is encapsulated 
within multiple levels of objects or modules that are
distributed among the servers offering the replicated service
and a central server that receives requests for the service.
Status objects gather or retrieve information concerning the
specified status or operational characteristic(s) of each of the
plurality of servers. An individual server monitor object
operates for each instance of the replicated service to invoke
one or more status objects and receive the necessary
information. A central replicated monitor object receives the
information from each individual server monitor object. The
information from the servers is analyzed to select the server
having the optimal status or operational characteristic(s). An
update object updates the central server, such as a domain 
name server, to indicate the preferred server. Requests for 
the replicated service are then directed to the preferred 
server until a different preferred server is identified. 
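
The layering of status objects, individual server monitors, a central replicated
monitor and an update object can be sketched in Python as follows. The class
names, the `least_loaded` policy function and the in-memory `records` dict that
stands in for a domain name server are all hypothetical; the sketch only shows
how a policy could be evaluated over the collected status values.

```python
from typing import Callable, Dict, List

StatusObject = Callable[[], float]  # a probe returning one status value


class ServerMonitor:
    """Runs once per service instance and invokes its status objects."""

    def __init__(self, server: str, status_objects: Dict[str, StatusObject]):
        self.server = server
        self.status_objects = status_objects

    def report(self) -> Dict[str, float]:
        return {name: probe() for name, probe in self.status_objects.items()}


class ReplicatedMonitor:
    """Central monitor: gathers every server's report and applies the policy."""

    def __init__(self, monitors: List[ServerMonitor],
                 policy: Callable[[Dict[str, float]], float]):
        self.monitors = monitors
        self.policy = policy  # lower score means more preferred

    def preferred_server(self) -> str:
        reports = {m.server: m.report() for m in self.monitors}
        return min(reports, key=lambda s: self.policy(reports[s]))


class Updater:
    """Stand-in for the update object; a dict plays the role of the DNS."""

    def __init__(self):
        self.records: Dict[str, str] = {}

    def update(self, service: str, server: str) -> None:
        self.records[service] = server


def least_loaded(report: Dict[str, float]) -> float:
    """Example policy: prefer the server reporting the smallest load."""
    return report["load"]
```

Swapping `least_loaded` for a function that scores, say, a measured distance
would change the policy without touching the monitor classes.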

30 Claims, 5 Drawing Sheets 
[57] ABSTRACT

A method for use in a geographically distributed or clustered
system wherein an arbiter assigns clients to servers. The 
arbiter also dynamically assigns a valid time interval to each 
mapping request based on network load and/or capacity 
parameters such as the client request rate and/or the server 
capacity. Alternative means for dynamically setting the valid 
interval in conjunction with a scheduling process, which can 
be either deterministic or probabilistic, are also devised. 

20 Claims, 9 Drawing Sheets 
TITLE: Method and apparatus for dynamic interval-based load balancing

Brief Summary Text (19) : 

A method having features of the present invention can be embodied in a distributed 
or clustered network of servers wherein clients are divided into groups which 
periodically send mapping requests to an arbitrator for mapping and balancing 
service requests among multiple replicated servers which can service the request.
An example of a computerized method according to the present invention for mapping 
servers to service requests includes the steps of: mapping a first mapping request 
from a first group to a first server according to a scheduling process; dynamically 
computing a valid interval for said mapping request to the first server as a 
function of one of a first group request load and a first server capacity; and 
communicating the server selection and the valid interval to the first group for 
caching such that subsequent requests from the first group are routed to the first 
server during the valid interval. 
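
The steps in this summary can be sketched in Python under stated assumptions.
The source says only that the valid interval is computed as a function of the
group request load or the server capacity. The particular formula below
(interval proportional to capacity divided by request rate, clamped to a range),
the constants `base`, `lo` and `hi`, the `Arbiter` class and the round-robin
scheduling are all illustrative choices, not the patented method.

```python
import itertools


def valid_interval(group_rate, server_capacity, base=60.0, lo=5.0, hi=600.0):
    """Assumed formula: heavier groups get shorter intervals, so they can be
    remapped sooner; higher-capacity servers get longer ones."""
    if group_rate <= 0:
        return hi
    return max(lo, min(hi, base * server_capacity / group_rate))


class Arbiter:
    """Hypothetical arbiter using deterministic round-robin scheduling."""

    def __init__(self, capacities):
        self.capacities = capacities  # server name -> capacity (requests/s)
        self._next_server = itertools.cycle(sorted(capacities))

    def map_request(self, group_rate):
        """Return (server, valid interval in seconds) for one mapping request."""
        server = next(self._next_server)
        return server, valid_interval(group_rate, self.capacities[server])
```

A group would cache the returned server and route its requests there until the
interval expires, then send a new mapping request to the arbiter.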